Symbiotic filtering for spam email detection

نویسندگان

  • Clotilde Lopes
  • Paulo Cortez
  • Pedro Nuno Miranda de Sousa
  • Miguel Rocha
  • Miguel Rio
چکیده

This paper presents a novel spam filtering technique called Symbiotic Filtering (SF) that aggregates distinct local filters from several users to improve the overall performance of spam detection. SF is an hybrid approach combining some features from both Collaborative (CF) and Content-Based Filtering (CBF). It allows for the use of social networks to personalize and tailor the set of filters that serve as input to the filtering. A comparison is performed against the commonly used Naive Bayes CBF algorithm. Several experiments were held with the well-known Enron data, under both fixed and incremental symbiotic groups. We show that our system is competitive in performance and is robust against both dictionary and focused contamination attacks. Moreover, it can be implemented and deployed with few effort and low communication costs, while assuring privacy.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Towards Symbiotic Spam E-mail Filtering

This position paper discusses the use of symbiotic filtering, a novel distributed data mining approach that combines contentbased and collaborative filtering for spam detection.

متن کامل

Evolutionary Symbiotic Feature Selection for Email Spam Detection

This work presents a symbiotic filtering approach enabling the exchange of relevant word features among different users in order to improve local anti-spam filters. The local spam filtering is based on a ContentBased Filtering strategy, where word frequencies are fed into a Naive Bayes learner. Several Evolutionary Algorithms are explored for feature selection, including the proposed symbiotic ...

متن کامل

Email Spam Detection: a Symbiotic Feature Selection Approach Fostered by Evolutionary Computation

The electronic mail (email) is nowadays an essential communication service being widely used by most Internet users. One of the main problems affecting this service is the proliferation of unsolicited messages (usually denoted by spam) which, despite the efforts made by the research community, still remains as an inherent problem affecting this Internet service. In this perspective, this work p...

متن کامل

A Novel Hybrid Approach for Email Spam Detection based on Scatter Search Algorithm and K-Nearest Neighbors

Because cyberspace and Internet predominate in the life of users, in addition to business opportunities and time reductions, threats like information theft, penetration into systems, etc. are included in the field of hardware and software. Security is the top priority to prevent a cyber-attack that users should initially be detecting the type of attacks because virtual environments are not moni...

متن کامل

Feature Weight Optimization Mechanism for Email Spam Detection based on Two-Step Clustering Algorithm and Logistic Regression Method

This research proposed an improved filtering spam technique for suspected emails, messages based on feature weight and the combination of two-step clustering and logistic regression algorithm. Unique, important features are used as the optimum input for a hybrid proposed approach. This study adopted a spam detector model based on distance measure and threshold value. The aim of this model was t...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Expert Syst. Appl.

دوره 38  شماره 

صفحات  -

تاریخ انتشار 2011